2.0 CCLE Gene Expression

This notebook uses Clustergrammer2 to visualize gene expression data from the Cancer Cell Line Encyclopedia (CCLE), obtained from the Broad Institute. The CCLE project measured genetic data from over 1000 cancer cell lines and provides cell line annotations (e.g. tissue), which are used here to generate cell type categories.

In [1]:
from clustergrammer2 import net
import warnings
warnings.filterwarnings('ignore')
>> clustergrammer2 backend version 0.5.1
In [2]:
import pandas as pd
df = pd.read_csv('../data/CCLE/CCLE.txt.gz', compression='gzip', index_col=0)
In [3]:
from ast import literal_eval as make_tuple
# Column labels are read from the CSV as strings; parse each one back into a Python tuple
cols = df.columns.tolist()
new_cols = [make_tuple(x) for x in cols]
df.columns = new_cols
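
The column-parsing step above can be illustrated in isolation. This is a minimal sketch using made-up labels: the actual label format in the CCLE file is not shown in this notebook, so the cell line names and category strings below are hypothetical.

```python
from ast import literal_eval

# Hypothetical column labels, as they might look after being written to
# and read back from a CSV file (each tuple has become a plain string).
raw_cols = ["('A549', 'tissue: lung')", "('MCF7', 'tissue: breast')"]

# literal_eval safely evaluates a string containing a Python literal,
# so each label is turned back into a real tuple.
parsed_cols = [literal_eval(label) for label in raw_cols]

print(parsed_cols[0])
```

Using `literal_eval` rather than `eval` means only Python literals are accepted, which is the safer choice when the strings come from a data file.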

CCLE Gene Expression Data

In [4]:
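net.load_df(df)
# Keep the 1000 genes (rows) with the highest variance across cell lines
net.filter_N_top(inst_rc='row', N_top=1000, rank_type='var')
# Round values to two decimals and reload the filtered matrix
net.load_df(net.export_df().round(2))
net.widget()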
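
To make the variance filter concrete, here is a plain-Python sketch of the idea of keeping the N most variable rows. It is not Clustergrammer2's implementation: the gene names, values, and the `top_n_by_variance` helper are hypothetical, and the use of sample variance is an assumption.

```python
from statistics import variance

# Hypothetical toy matrix: gene -> expression values across four cell lines.
expression = {
    'GENE_A': [1.0, 1.1, 0.9, 1.0],
    'GENE_B': [0.0, 5.0, 10.0, 2.0],
    'GENE_C': [3.0, 3.5, 2.5, 4.0],
}

def top_n_by_variance(rows, n):
    """Return the n rows with the largest sample variance, highest first."""
    ranked = sorted(rows, key=lambda gene: variance(rows[gene]), reverse=True)
    return {gene: rows[gene] for gene in ranked[:n]}

top_genes = top_n_by_variance(expression, n=2)
print(list(top_genes))
```

Filtering on variance keeps the genes whose expression differs most between cell lines, which are typically the most informative rows for clustering.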

CCLE Gene Expression Data, Z-score Genes

In [5]:
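net.load_df(df)
# Keep the 1000 genes (rows) with the highest variance across cell lines
net.filter_N_top(inst_rc='row', N_top=1000, rank_type='var')
# Z-score each gene (row) across cell lines
net.normalize(axis='row', norm_type='zscore')
# Round values to two decimals and reload the normalized matrix
net.load_df(net.export_df().round(2))
net.widget()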
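
The row-wise Z-score can be sketched in plain Python as follows. This only illustrates the general formula (subtract the row mean, divide by the row standard deviation). The `zscore_row` helper and the example values are hypothetical, and the choice of sample standard deviation is an assumption that may differ from what Clustergrammer2 uses internally.

```python
from statistics import mean, stdev

def zscore_row(values):
    """Z-score one row: subtract its mean and divide by its sample std."""
    mu = mean(values)
    sigma = stdev(values)  # assumption: sample (n - 1) standard deviation
    return [(v - mu) / sigma for v in values]

# Hypothetical expression values for one gene across three cell lines.
row = [2.0, 4.0, 6.0]
z = [round(v, 2) for v in zscore_row(row)]
print(z)
```

After this transformation each gene has mean 0 and unit spread, so the heatmap shows each gene's expression relative to its own average rather than its absolute level.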